Hot methane seeps could support life beneath Antarctica's ice sheet

New Scientist

Microbes living beneath Antarctica's ice sheet may survive on methane generated by geothermal heat rising from deep below Earth's surface. The discovery could have implications for assessing the potential for life to survive on icy worlds beyond Earth. "These could be hotspots for microbes that are adapted to live in these areas," says Gavin Piccione at Brown University in Rhode Island. We already know that there is methane beneath Antarctica's ice sheet.


Brown University student angers non-faculty employees by asking 'what do you do all day,' faces punishment

FOX News

Alex Shieh is a student at Brown University. He is making waves and facing charges for asking the school's non-faculty employees what they do all day. A sophomore at Brown University is facing the school's wrath after he sent a DOGE-like email to non-faculty employees asking them what they do all day, in an effort to figure out why the elite school's tuition has gotten so expensive. "The inspiration for this is the rising cost of tuition," Alex Shieh told Fox News Digital in an interview. "Next year, it's set to be $93,064 to go to Brown," Shieh said of the Ivy League university.


Anthropic can now track the bizarre inner workings of a large language model

MIT Technology Review

It's no secret that large language models work in mysterious ways. Few, if any, mass-market technologies have ever been so little understood. That makes figuring out what makes them tick one of the biggest open challenges in science. Shedding some light on how these models work would expose their weaknesses, revealing why they make stuff up and can be tricked into going off the rails. It would help resolve deep disputes about exactly what these models can and can't do.


Directional Pruning of Deep Neural Networks

Neural Information Processing Systems

Given that stochastic gradient descent (SGD) often finds a flat minimum valley in the training loss, we propose a novel directional pruning method that searches for a sparse minimizer in or close to that flat region. The proposed pruning method requires neither retraining nor expert knowledge of the sparsity level.
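
The core idea, finding a sparse point that stays inside SGD's flat valley, can be illustrated with a toy sketch. This is a minimal illustration under our own assumptions (a cheap `loss_fn`, a greedy coordinate loop), not the paper's actual algorithm:

```python
import numpy as np

def directional_prune(w, loss_fn, tol=1e-3):
    """Toy sketch: zero coordinates of w whose removal keeps the
    training loss within `tol` of its current value, i.e. moves the
    minimizer only inside the flat valley (no retraining, no preset
    sparsity level)."""
    base = loss_fn(w)
    pruned = w.copy()
    for i in np.argsort(np.abs(w)):        # try small weights first
        trial = pruned.copy()
        trial[i] = 0.0
        if loss_fn(trial) - base <= tol:   # still in the flat region
            pruned = trial
    return pruned

# Example: a quadratic with a nearly flat direction along coordinate 1.
loss = lambda w: (w[0] - 1.0) ** 2 + 1e-6 * w[1] ** 2
print(directional_prune(np.array([1.0, 5.0]), loss))  # -> [1., 0.]
```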


Learning to Navigate Wikipedia by Taking Random Walks (Kenneth Marino, John Schultz)

Neural Information Processing Systems

A fundamental ability of an intelligent web-based agent is seeking out and acquiring new information. Internet search engines reliably find the correct vicinity, but the top results may be a few links away from the desired target. A complementary approach is navigation via hyperlinks, employing a policy that comprehends local content and selects a link that moves it closer to the target. In this paper, we show that behavioral cloning of randomly sampled trajectories is sufficient to learn an effective link selection policy. We demonstrate the approach on a graph version of Wikipedia with 38M nodes and 387M edges. The model is able to efficiently navigate between nodes 5 and 20 steps apart 96% and 92% of the time, respectively. We then use the resulting embeddings and policy in downstream fact verification and question answering tasks where, in combination with basic TF-IDF search and ranking methods, they achieve results competitive with state-of-the-art methods.
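
The behavioral-cloning recipe is simple enough to sketch on a toy graph: sample random walks, treat each walk's endpoint as the navigation target, and train a classifier to reproduce each step. The graph, TF-IDF features, and logistic-regression policy below are illustrative stand-ins, not the paper's 38M-node setup:

```python
import random
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy graph: node -> (page text, outgoing links). Names are illustrative.
graph = {
    "physics": ("study of matter and energy", ["energy", "newton"]),
    "energy":  ("capacity to do work", ["physics", "heat"]),
    "newton":  ("english mathematician and physicist", ["physics"]),
    "heat":    ("transfer of thermal energy", ["energy", "physics"]),
}
vec = TfidfVectorizer().fit(text for text, _ in graph.values())
emb = {n: vec.transform([graph[n][0]]).toarray()[0] for n in graph}

# Behavioral cloning: along a random walk v0..vk with target vk, the
# "expert" action at step i is the link actually taken, v_{i+1}.
X, y = [], []
for _ in range(300):
    node = random.choice(list(graph))
    walk = [node]
    for _ in range(3):
        node = random.choice(graph[node][1])
        walk.append(node)
    target = walk[-1]
    for cur, nxt in zip(walk, walk[1:]):
        for cand in graph[cur][1]:
            X.append(np.concatenate([emb[cur], emb[target], emb[cand]]))
            y.append(int(cand == nxt))
policy = LogisticRegression(max_iter=1000).fit(X, y)

def step(cur, target):
    """Greedy navigation: follow the link the cloned policy scores highest."""
    cands = graph[cur][1]
    feats = [np.concatenate([emb[cur], emb[target], emb[c]]) for c in cands]
    return cands[int(np.argmax(policy.predict_proba(feats)[:, 1]))]
```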


Learning Object Placement Programs for Indoor Scene Synthesis with Iterative Self Training

arXiv.org Artificial Intelligence

Data-driven, autoregressive indoor scene synthesis systems generate indoor scenes automatically by suggesting and then placing objects one at a time. Empirical observations show that current systems tend to produce incomplete next-object location distributions. We introduce a system which addresses this problem. We design a Domain Specific Language (DSL) that specifies functional constraints. Programs in our language take as input a partial scene and an object to place; upon execution, they predict possible object placements. We design a generative model which writes these programs automatically. Available 3D scene datasets do not contain programs to train on, so we build upon previous work in unsupervised program induction to introduce a new program bootstrapping algorithm. To quantify our empirical observations, we introduce a new evaluation procedure which captures how well a system models per-object location distributions. We ask human annotators to label all the possible places an object can go in a scene and show that our system produces per-object location distributions more consistent with human annotations. Our system also generates indoor scenes of quality comparable to previous systems, and while previous systems degrade in performance when training data is sparse, our system does not degrade to the same degree.
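
A placement DSL of the kind described can be sketched in a few lines: a program is a list of constraint functions, and executing it on a partial scene intersects their feasibility masks. The constraint names (`against_wall`, `near`) and the grid discretization are our illustrative assumptions, not the paper's actual language:

```python
import numpy as np

GRID = 16  # discretize the room floor into GRID x GRID cells

def against_wall(scene, margin=1):
    """Feasible cells within `margin` of any wall."""
    mask = np.zeros((GRID, GRID), dtype=bool)
    mask[:margin, :] = mask[-margin:, :] = True
    mask[:, :margin] = mask[:, -margin:] = True
    return mask

def near(scene, category, radius=3):
    """Feasible cells within `radius` of any placed object of `category`."""
    mask = np.zeros((GRID, GRID), dtype=bool)
    xs, ys = np.ogrid[:GRID, :GRID]
    for (x, y, cat) in scene:  # scene = list of placed (x, y, category)
        if cat == category:
            mask |= (xs - x) ** 2 + (ys - y) ** 2 <= radius ** 2
    return mask

def execute(program, scene):
    """Run a program: intersect all constraint masks into one
    feasibility mask over possible placements."""
    mask = np.ones((GRID, GRID), dtype=bool)
    for constraint in program:
        mask &= constraint(scene)
    return mask

# A "nightstand" program: against a wall AND near the bed.
scene = [(2, 8, "bed")]
program = [against_wall, lambda s: near(s, "bed")]
print(np.argwhere(execute(program, scene)))  # candidate cells
```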


Handwritten Text Recognition: A Survey

arXiv.org Artificial Intelligence

Handwritten Text Recognition (HTR) has become an essential field within pattern recognition and machine learning, with applications spanning historical document preservation to modern data entry and accessibility solutions. The complexity of HTR lies in the high variability of handwriting, which makes it challenging to develop robust recognition systems. This survey examines the evolution of HTR models, tracing their progression from early heuristic-based approaches to contemporary state-of-the-art neural models, which leverage deep learning techniques. The scope of the field has also expanded, with models initially capable of recognizing only word-level content progressing to recent end-to-end document-level approaches. Our paper categorizes existing work into two primary levels of recognition: (1) "up to line-level", encompassing word and line recognition, and (2) "beyond line-level", addressing paragraph- and document-level challenges. We provide a unified framework that examines research methodologies, recent advances in benchmarking, key datasets in the field, and a discussion of the results reported in the literature. Finally, we identify pressing research challenges and outline promising future directions, aiming to equip researchers and practitioners with a roadmap for advancing the field.


Cross-Encoder Rediscovers a Semantic Variant of BM25

arXiv.org Artificial Intelligence

Neural Ranking Models (NRMs) have rapidly advanced state-of-the-art performance on information retrieval tasks. In this work, we investigate a Cross-Encoder variant of MiniLM to determine which relevance features it computes and where they are stored. We find that it employs a semantic variant of the traditional BM25 in an interpretable manner, featuring localized components: (1) Transformer attention heads that compute soft term frequency while controlling for term saturation and document length effects, and (2) a low-rank component of its embedding matrix that encodes inverse document frequency information for the vocabulary. This suggests that the Cross-Encoder uses the same fundamental mechanisms as BM25, but further leverages their capacity to capture semantics for improved retrieval performance. This granular understanding lays the groundwork for model editing to enhance transparency, address safety concerns, and improve scalability in training and real-world applications.
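
For reference, the classical scoring function the model is said to rediscover is BM25, whose three ingredients (term-frequency saturation, document-length normalization, IDF weighting) map onto the components the paper localizes. The sketch below is the standard textbook BM25 over tokenized documents, not the model's learned semantic variant, which replaces exact term matches with soft, embedding-based ones:

```python
import math
from collections import Counter

def bm25(query, doc, docs, k1=1.5, b=0.75):
    """Okapi BM25: IDF weighting, term-frequency saturation via k1,
    and document-length normalization via b."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    tf = Counter(doc)
    score = 0.0
    for t in set(query):
        df = sum(t in d for d in docs)                  # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)
        sat = tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(doc) / avgdl))
        score += idf * sat
    return score

docs = [["neural", "ranking", "models"], ["bm25", "ranking", "classic"]]
print(bm25(["bm25", "ranking"], docs[1], docs))
```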


Scale-Insensitive Neural Network Significance Tests

arXiv.org Machine Learning

This paper develops a scale-insensitive framework for neural network significance testing, substantially generalizing existing approaches through three key innovations. First, we replace metric entropy calculations with Rademacher complexity bounds, enabling the analysis of neural networks without requiring bounded weights or specific architectural constraints. Second, we weaken the regularity conditions on the target function to require only Sobolev space membership $H^s([-1,1]^d)$ with $s > d/2$, significantly relaxing previous smoothness assumptions while maintaining optimal approximation rates. Third, we introduce a modified sieve space construction based on moment bounds rather than weight constraints, providing a more natural theoretical framework for modern deep learning practices. Our approach achieves these generalizations while preserving optimal convergence rates and establishing valid asymptotic distributions for test statistics. The technical foundation combines localization theory, sharp concentration inequalities, and scale-insensitive complexity measures to handle unbounded weights and general Lipschitz activation functions. This framework better aligns theoretical guarantees with contemporary deep learning practice while maintaining mathematical rigor.
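
The complexity measure behind the first innovation, Rademacher complexity, has a direct Monte Carlo illustration: draw random signs and measure how well the function class can correlate with them. The sketch below is a toy numeric illustration of the quantity, with the supremum approximated by fitting a small network; the paper's bounds are analytic, not computed this way:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def empirical_rademacher(X, fit_fn, n_draws=20, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity
    R_hat(F) = E_sigma[ sup_{f in F} (1/n) sum_i sigma_i f(x_i) ],
    approximating the supremum by fitting a model to the random signs."""
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=len(X))
        f = fit_fn(X, sigma)                 # ERM stand-in for the sup
        vals.append(np.mean(sigma * f.predict(X)))
    return float(np.mean(vals))

X = np.random.default_rng(1).uniform(-1, 1, size=(200, 2))
fit = lambda X, y: MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000).fit(X, y)
print(empirical_rademacher(X, fit))  # richer classes -> larger values
```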


Extracting Problem Structure with LLMs for Optimized SAT Local Search

arXiv.org Artificial Intelligence

Current SAT local search tools apply basic strategies that work well for random problems but miss critical patterns in structured instances. SAT encodings of real problems contain inherited patterns from graph layouts, data connections, and domain-specific rules. The transformation to Conjunctive Normal Form (CNF) obscures these patterns, and current local search methods skip these structures in favor of general approaches. This paper addresses these limitations by introducing a framework that leverages LLMs to generate local search strategies tailored to encoding structures, enabling solvers to take advantage of these patterns for improved performance. Our research addresses three questions: 1. How can LLMs analyze PySAT [Ignatiev et al., 2024] code to interpret how problem structure translates to SAT clauses? 2. How can we create local search strategies that recognize and exploit these encoding patterns?
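
As a concrete instance of structure that CNF flattening hides, consider the small PySAT encoding below (our own illustration, not the paper's code): a graph 3-coloring instance whose one-hot blocks and per-edge clause patterns are obvious in the encoding code, yet a local search solver only ever sees the undifferentiated clause list:

```python
from pysat.formula import CNF
from pysat.solvers import Solver

# 3-coloring of a triangle. var(v, c) = "vertex v gets color c".
edges = [(0, 1), (1, 2), (0, 2)]
K = 3
var = lambda v, c: v * K + c + 1

cnf = CNF()
for v in range(3):
    cnf.append([var(v, c) for c in range(K)])       # at least one color
    for c1 in range(K):
        for c2 in range(c1 + 1, K):
            cnf.append([-var(v, c1), -var(v, c2)])  # at most one color
for (u, v) in edges:
    for c in range(K):
        cnf.append([-var(u, c), -var(v, c)])        # endpoints differ

# The solver receives only the flat clause list; the one-hot and
# per-edge structure above is exactly the kind of pattern an LLM
# could recover from the encoding code for a tailored strategy.
with Solver(bootstrap_with=cnf.clauses) as s:
    print(s.solve(), s.get_model())
```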